Order Matters: Sequence to sequence for sets

Authors

  • Oriol Vinyals
  • Samy Bengio
  • Manjunath Kudlur
Abstract

Sequences have become first-class citizens in supervised learning thanks to the resurgence of recurrent neural networks. Many complex tasks that require mapping from or to a sequence of observations can now be formulated with the sequence-to-sequence (seq2seq) framework, which employs the chain rule to efficiently represent the joint probability of sequences. In many cases, however, variable-sized inputs and/or outputs might not be naturally expressed as sequences. For instance, it is not clear how to input a set of numbers into a model where the task is to sort them; similarly, we do not know how to organize outputs when they correspond to random variables and the task is to model their unknown joint probability. In this paper, we first show using various examples that the order in which we organize input and/or output data matters significantly when learning an underlying model. We then discuss an extension of the seq2seq framework that goes beyond sequences and handles input sets in a principled way. In addition, we propose a loss which, by searching over possible orders during training, deals with the lack of structure of output sets. We show empirical evidence of our claims regarding ordering, and on the modifications to the seq2seq framework, on benchmark language modeling and parsing tasks, as well as two artificial tasks: sorting numbers and estimating the joint probability of unknown graphical models.
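
For reference, the chain-rule factorization mentioned in the abstract can be written out as below. The symbols $X$ (input), $Y = (y_1, \dots, y_T)$ (output sequence), and $\theta$ (model parameters) are notation assumed here, not taken from the abstract. The second expression is one plausible way to write a loss that searches over output orders, with $\pi$ ranging over orderings of the output set $Y_i$; it is a sketch of the idea, not necessarily the exact objective used in the paper.

```latex
% Chain rule: the joint probability of a sequence factorizes exactly
% into a product of per-step conditionals.
P(Y \mid X; \theta) = \prod_{t=1}^{T} P(y_t \mid y_1, \dots, y_{t-1}, X; \theta)

% Order search (sketch, assumed notation): for each training pair,
% score the output set under its best ordering \pi.
\theta^{\star} = \arg\max_{\theta} \sum_{i} \max_{\pi} \log P\bigl(\pi(Y_i) \mid X_i; \theta\bigr)
```

The factorization holds for any ordering of the outputs; the abstract's point is that, in practice, the ordering chosen affects how well a model learns.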

Similar resources

Minimizing the total tardiness and makespan in an open shop scheduling problem with sequence-dependent setup times

We consider an open shop scheduling problem with separate setup and processing times, in which the setup times depend not only on the machines but also on the sequence of jobs processed on each machine. A novel bi-objective mathematical program is designed to minimize the total tardiness and the makespan. Among several mult...

Completeness results for metrized rings and lattices

The Boolean ring $B$ of measurable subsets of the unit interval, modulo sets of measure zero, has proper radical ideals (for example, $\{0\}$) that are closed under the natural metric, but has no prime ideal closed under that metric; hence closed radical ideals are not, in general, intersections of closed prime ideals. Moreover, $B$ is known to be complete in its metric. Togethe...

Evaluation of Bi-objective Scheduling Problems by FDH, Distance and Triangle Methods

In this paper, two methods, named the distance and triangle methods, are extended to evaluate the quality of approximations of the Pareto set obtained when solving bi-objective problems. To apply these evaluation methods, a bi-objective problem must first be defined. The problem considered is that of scheduling jobs in a hybrid flow shop environment with sequence-dependent setup times and the objectives of minimi...

On Lacunary Statistical Limit and Cluster Points of Sequences of Fuzzy Numbers

For any lacunary sequence $\theta = (k_{r})$, we define the concepts of $S_{\theta}$-limit point and $S_{\theta}$-cluster point of a sequence of fuzzy numbers $X = (X_{k})$. We introduce the new sets $\Lambda^{F}_{S_{\theta}}(X)$, $\Gamma^{F}_{S_{\theta}}(X)$ and prove some inclusion relations between these and the sets $\Lambda^{F}_{S}(X)$, $\Gamma^{F}_{S}(X)$ introduced in \cite{Ayt:Slpsfn} by Aytar [...

Journal:
  • CoRR

Volume: abs/1511.06391

Publication date: 2015